Very large vocabulary voice dictation for mobile devices
نویسندگان
چکیده
This paper deals with optimization techniques that can make very large vocabulary voice dictation applications deployable on recent mobile devices. We focus namely on optimization of signal parameterization (frame rate, FFT calculation, fixedpoint representation) and on efficient pruning techniques employed on the state and Gaussian mixture level. We demonstrate the applicability of the proposed techniques on the practical design of an embedded 255K-word discrete dictation program developed for Czech. Its real performance is comparable to a client-server version of the fluent dictation program implemented on the same mobile device.
منابع مشابه
User Expectations from Dictation on Mobile Devices
Mobile phones, with their increasing processing power and memory, are enabling a diversity of tasks. The traditional text entry method using keypad is falling short in numerous ways. Some solutions to this problem include: QWERTY keypads on phone, external keypads, virtual keypads on table tops (Seimens at CeBIT ‘05) and last but not the least, automatic speech recognition (ASR) technology. Spe...
متن کاملDesign and development of voice controlled aids for motor-handicapped persons
In this paper we present two voice-operated systems that have been designed for Czech motor-handicapped people to allow them full access to computers and computer based services. The programs, which are named MyVoice and MyDictate, are complementary in their functions. Both employ ASR engines developed in our lab. The former is used primarily as a midsize-vocabulary (up to 10K words) voice comm...
متن کاملDynamic lexicon for a very large vocabulary vocal dictation
For very large vocabulary vocal dictation systems, we present a decoding strategy useful to reduce the lexical decoding cost. For each test-utterance, a sub-lexicon is selected from a very large recognition vocabulary. Such a recognition sub-lexicon is called Dynamic Lexicon (DL). Various algorithms of DL selection are developed and tested in terms of coverage rate of textual corpus. From these...
متن کاملTranslating On the Go? Investigating the Potential of Multimodal Mobile Devices for Interactive Translation Dictation
This article provides a general overview of interactive translation dictation (ITD), an emerging translation technique that involves interacting with multimodal voice-and-touchenabled devices such as touch-screen computers, tablets and smartphones. The author discusses the interest in integrating new techniques and technologies into the translation sector, provides a brief description of a rece...
متن کاملMobile, L2 vocabulary learning, and fighting illiteracy: A case study of Iranian semi-illiterates beyond transition level
As mobile learning simultaneously employs both handheld computers and mobile telephones and other devices that draw on the same set of functionalities, it throws open the door for swift connection between learners and teachers. This study examined and articulated the impact of the application of mobile devices for teaching English vocabulary items to 123 Iranian semi-illitera...
متن کامل